Health informatics sits at the intersection of medicine, data science, and technology, and it shapes how health information is stored, analyzed, and used. The field helps clinicians and researchers find patterns in patient data, improve diagnostic accuracy, and personalize treatment plans. By turning raw medical records into actionable insights, it is changing healthcare delivery and population health management.

At Gist.Science, we bridge the gap between cutting-edge research and public understanding by curating the latest preprints from medRxiv specifically within this domain. Our team processes every new submission in this category, providing both accessible plain-language explanations and detailed technical summaries to ensure the science is clear for everyone, from policymakers to curious readers. Below are the latest papers in health informatics, freshly distilled and ready for you to explore.

Development of a natural language processing application to extract and categorize mentions of violence from mental healthcare records text

This study developed and validated a multi-label BERT-based natural language processing application that successfully extracts and categorizes various forms of violence, patient roles, and contextual details from unstructured mental health records, achieving high performance on most features except temporal aspects.

Li, L., Sondh, S., Sondh, H. K., Stewart, R., Roberts, A. | 2026-03-26 | health informatics

A statistical framework for evaluating the repeatability and reproducibility of large language models

This paper presents a regulatory-informed statistical framework that quantifies the semantic and internal repeatability and reproducibility of large language models. It shows that these metrics vary significantly with prompting strategy and model configuration, are often independent of diagnostic accuracy, and are essential for systematically evaluating LLM reliability in biomedical applications.

Shyr, C., Ren, B., Hsu, C.-Y., Yan, C., Tinker, R. J., Cassini, T. A., Hamid, R., Wright, A., Bastarache, L., Peterson, J. F., Malin, B. A., Xu, H. | 2026-03-25 | health informatics

Human-supervised, large language model-based clinical decision support aligned to national newborn protocols in Kenya: a pragmatic, early-stage evaluation

This study presents a pragmatic evaluation of AIFYA, a human-supervised large language model-based clinical decision support system aligned with Kenya's national newborn protocols, demonstrating successful implementation, high expert-rated accuracy, and strong user adoption in low-resource public health facilities.

Kuria, T., Kamau, G., Makokha, F., Omondi, P., Mbugua, G., David, K., Mbugua, S., Gitaka, J. | 2026-03-25 | health informatics

Medical errors in large language models revealed using 1,000 synthetic clinical transcripts

This study reveals that large language models, despite achieving high diagnostic accuracy on full histories, exhibit critical safety failures when evaluated against 1,000 synthetic clinical transcripts that simulate real-world medical complexity. These failures include discouraging essential investigations and inappropriately downgrading triage for life-threatening conditions, with significant gender disparities.

Auger, S. D., Scott, G. | 2026-03-25 | health informatics

The Power of Open Health Data: Impact, Representation, and Knowledge Diffusion

This study evaluates four major open health data repositories using a novel two-degree citation methodology. It finds that open data consistently generates a ~10x indirect citation amplification across vastly different funding levels, but that significant disparities in global representation and persistent gender gaps in senior authorship remain, indicating that data access alone cannot address structural inequities in research leadership.

Gorijavolu, R., Armengol de la Hoz, M. A., Bielick, C., Cajas, S., Charpignon, M.-L., El Mir, A., Gichoya, J. W., Kwak, H. G., Madapati, K., Mattie, H., McCullum, L., Mwavu, R., Nair, V., Nakayama, L. (…) | 2026-03-24 | health informatics

Social Determinants of Health and Chronic Disease Risk Prediction in the All of Us Research Program

Analyzing data from nearly 260,000 participants in the All of Us Research Program, this study demonstrates that integrating social determinants of health with demographics significantly improves chronic disease risk prediction. Mental health outcomes are primarily driven by experiential factors like stress and discrimination, whereas cardiometabolic conditions are more strongly influenced by structural neighborhood characteristics. The findings support condition-specific social screening and targeted interventions to reduce health disparities.

Kammer-Kerwick, M., Dave, Y., Parekh, V., McDonald, L., Watkins, S. C. | 2026-03-23 | health informatics

Impact of a Social Media Derived Digital Self Management Platform on Population Level Irritable Bowel Syndrome Emergency Utilization: A Controlled Interrupted Time Series Analysis Using South Korean National Health Insurance Data

Using a controlled interrupted time series analysis of national health insurance data, this study demonstrates that a social media-informed digital self-management platform, "Jang Geongang," significantly reduced population-level irritable bowel syndrome emergency department visits and unplanned hospitalizations in South Korea, particularly among younger adults and those with the diarrhea-predominant subtype.

Park, J.-H., Lim, A. | 2026-03-23 | health informatics

Automated Extraction of Cancer Registry Data from Pathology Reports: Comparing LLM-Based and Ontology-Driven NLP Platforms

This study demonstrates that an LLM-based platform (Brim Analytics) achieves high accuracy and efficient processing for extracting cancer registry data from pathology reports, outperforming an ontology-driven system (DeepPhe) particularly in T stage classification across pancreatic and breast cancer cases.

McPhaul, T., Kreimeyer, K., Baris, A., Botsis, T. | 2026-03-23 | health informatics